Abstract



SD codes are erasure codes that address the mixed failure mode of current RAID 
systems. Rather than dedicate entire disks to erasure coding, as done in RAID-5, 
RAID-6 and Reed-Solomon coding, an SD code dedicates entire disks, plus individual 
sectors to erasure coding. The code then tolerates combinations of disk and sector 
errors, rather than solely disk errors. It has been an open problem to construct general
codes that have the SD property, and previous work has relied on Monte Carlo searches. 
In this paper, we present two general constructions that address the cases with one disk 
and two sectors, and two disks and two sectors. Additionally, we make an observation 
about shortening SD codes that allows us to prune Monte Carlo searches. 

Keywords: Error-correcting codes, RAID architectures, MDS codes, array codes, 
Reed-Solomon codes, Blaum-Roth codes, PMDS codes, SD codes. 

1 Introduction 

The motivation for and description of SD codes are presented in earlier work by Plank, Blaum and
Hafner [5]. In this work, we assume that the reader has read that paper or the follow-on
paper [6].

We use the following nomenclature to describe an SD code: 

• n: The total number of disks in a disk array. 

• m: The total number of disks dedicated to fault-tolerance. 

• s: The total number of additional sectors per stripe dedicated to fault-tolerance. 

• r: The total number of sectors per disk in a stripe. 

• $GF(2^w)$: The Galois field that defines the arithmetic.

• H: An $(mr + s) \times nr$ parity-check matrix.
The parity-check matrix has a specific format:

• For row $i < mr$, the only non-zero elements are in columns $\lfloor i/m \rfloor n$ through $(\lfloor i/m \rfloor + 1)n - 1$,
where $\lfloor i/m \rfloor$ denotes integer division.

• For row $mr \le i < mr + s$, all elements are non-zero.

Each block in the stripe has a corresponding column of the parity-check matrix. In
particular, block i of disk j corresponds to column ni + j. The code is SD if it tolerates any
combination of m disk failures and s additional sector failures.
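The indexing conventions above can be sketched in a few lines. This is a minimal illustration rather than code from the paper: the helper names `column_of` and `row_support` are hypothetical, and the support rule assumes that a row i < mr touches only the n columns of stripe row i // m, while the last s rows touch every column.

```python
def column_of(block: int, disk: int, n: int) -> int:
    """Column of H for block `block` of disk `disk` (column n*block + disk)."""
    return n * block + disk


def row_support(i: int, n: int, m: int, r: int, s: int) -> range:
    """Columns that may be non-zero in row i of the (m*r + s) x (n*r) matrix H.

    Assumption: rows i < m*r touch only the n columns of stripe row i // m;
    the last s rows touch every column.
    """
    if not 0 <= i < m * r + s:
        raise ValueError("row index out of range")
    if i < m * r:
        stripe_row = i // m  # integer division, as in the text
        return range(stripe_row * n, stripe_row * n + n)
    return range(n * r)
```

For n = 5, m = 1, r = 3 and s = 2, block 2 of disk 4 maps to column 14, and row 1 of H is supported on columns 5 through 9.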

General constructions of SD codes have been heretofore limited. Blaum, Hafner and Hetzler
have given constructions when s = 1 [2], and Blaum has presented a construction for m = 1
and s = 2. In Section 2 of this paper, we present this code again, but with a simpler proof
of the SD property. In Section 3, we present a construction for m = 2 and s = 2. Finally, in
Section 4, we make an observation on SD codes that allows us to prune searches for further
constructions.



2 Construction of an SD code with m = 1 and s = 2 

Here we repeat the construction given in [1], but we give a simpler proof. 

Consider the field $GF(2^w)$ and let $\alpha$ be an element in $GF(2^w)$. The (multiplicative) order of
$\alpha$, denoted $O(\alpha)$, is the minimum $\ell$, $0 < \ell$, such that $\alpha^\ell = 1$. If $\alpha$ is a primitive element [1],
then $O(\alpha) = 2^w - 1$. To each element $\alpha \in GF(2^w)$, there is an associated (irreducible)
minimal polynomial [1] that we denote $f_\alpha(x)$.
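As an illustration of the multiplicative order, the following sketch computes it by repeated multiplication in GF(16). It is not taken from the paper: the field representation (polynomial basis reduced by x^4 + x + 1, with elements stored as 4-bit integers) and the helper names `gf_mul` and `order` are assumptions made for the example.

```python
def gf_mul(a: int, b: int, w: int = 4, poly: int = 0b10011) -> int:
    """Multiply a and b in GF(2^w), reducing by `poly` (x^4 + x + 1 by default)."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^w) is XOR
        b >>= 1
        a <<= 1
        if a & (1 << w):         # degree reached w: reduce modulo poly
            a ^= poly
    return result


def order(a: int, w: int = 4, poly: int = 0b10011) -> int:
    """Smallest l > 0 with a^l = 1, for a non-zero element a."""
    if a == 0:
        raise ValueError("0 has no multiplicative order")
    power, l = a, 1
    while power != 1:
        power = gf_mul(power, a, w, poly)
        l += 1
    return l
```

With this representation the element x (the integer 2) is primitive, so `order(2)` is 15, while x^3 (the integer 8) has order 5.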

Let $\alpha \in GF(2^w)$ and $rn \le O(\alpha)$. Consider the $(r + 2) \times rn$ parity-check matrix

\[
\left( C_0 \; C_1 \; \cdots \; C_{n-1} \mid C_n \; C_{n+1} \; \cdots \; C_{2n-1} \mid \cdots \mid C_{(r-1)n} \; C_{(r-1)n+1} \; \cdots \; C_{rn-1} \right) \qquad (1)
\]

where $C_j$ denotes a column of length $r + 2$, and, if $e_i$ denotes an $r \times 1$ vector whose coordinates
are zero except for coordinate $i$, which is 1, then, for $0 \le i \le r - 1$,



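\[
\left( C_{in}, C_{in+1}, \ldots, C_{(i+1)n-1} \right) =
\begin{pmatrix}
e_i & e_i & \cdots & e_i \\
\alpha^{in} & \alpha^{in+1} & \cdots & \alpha^{(i+1)n-1} \\
\alpha^{2in} & \alpha^{2in-1} & \cdots & \alpha^{2in-(n-1)}
\end{pmatrix}. \qquad (2)
\]

We denote as $C(r, n, 1; f_\alpha(x))$ the $[rn, r(n - 1) - 2]$ code over $GF(2^w)$ whose parity-check
matrix is given by (1) and (2).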

Example 2.1 Consider the finite field $GF(16)$ and let $\alpha$ be a primitive element, i.e., $O(\alpha) = 15$.
Then, the parity-check matrix of $C(3, 5, 1; f_\alpha(x))$ is given by

Similarly, the parity-check matrix of $C(5, 3, 1; f_\alpha(x))$ is given by
Let us point out that the construction of this type of codes is also valid over the ring
of polynomials modulo $M_p(x) = 1 + x + \cdots + x^{p-1}$, $p$ a prime number, as done with the
Blaum-Roth (BR) codes [3]. In that case, $O(\alpha) = p$, where $\alpha^{p-1} = 1 + \alpha + \cdots + \alpha^{p-2}$. The
construction proceeds similarly, and we denote it $C(r, n, 1; M_p(x))$. Utilizing the ring modulo
$M_p(x)$ allows for XOR operations at the encoding and the decoding without look-up tables
in a finite field, which is advantageous in erasure decoding [3]. It is well known that $M_p(x)$
is irreducible if and only if 2 is primitive in $GF(p)$ [1].
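The irreducibility criterion quoted above is easy to test numerically. The following sketch checks whether 2 is primitive in GF(p) by computing its multiplicative order modulo p; the function names are hypothetical and the code only illustrates the criterion, it does not factor $M_p(x)$.

```python
def multiplicative_order_mod(a: int, p: int) -> int:
    """Smallest k > 0 with a^k = 1 (mod p); assumes gcd(a, p) = 1."""
    k, power = 1, a % p
    while power != 1:
        power = (power * a) % p
        k += 1
    return k


def two_is_primitive(p: int) -> bool:
    """True when 2 generates the multiplicative group of GF(p), p an odd prime."""
    return multiplicative_order_mod(2, p) == p - 1
```

For example, 2 has order 8 modulo 17, so `two_is_primitive(17)` is `False`, whereas `two_is_primitive(5)`, `two_is_primitive(11)` and `two_is_primitive(13)` are `True`.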

Example 2.2 Consider the ring of polynomials modulo $M_{17}(x)$ and let $\alpha$ be an element in
the ring such that $\alpha^{16} = 1 + \alpha + \cdots + \alpha^{15}$, thus, $O(\alpha) = 17$ (notice, $M_{17}(x)$ is reducible).
Then, the parity-check matrix of $C(4, 4, 1; M_{17}(x))$ is given by
We have the following theorem:

Theorem 2.1 Codes $C(r, n, 1; f_\alpha(x))$ and $C(r, n, 1; M_p(x))$ are SD codes.

Proof: We break the proof into two cases. In the first, the two sector errors occur on the
same row of the stripe. In this case, we focus solely on three columns of the parity-check
matrix that share non-zero entries in one of the first $r$ rows. Put another way, the three
erasures will be corrected if and only if, for any $0 \le i \le r - 1$ and $0 \le j_0 < j_1 < j_2 \le n - 1$,
\[
\det \begin{pmatrix}
1 & 1 & 1 \\
\alpha^{in+j_0} & \alpha^{in+j_1} & \alpha^{in+j_2} \\
\alpha^{2in-j_0} & \alpha^{2in-j_1} & \alpha^{2in-j_2}
\end{pmatrix} \ne 0.
\]

But the determinant of this $3 \times 3$ matrix can be easily transformed into a Vandermonde
determinant on $\alpha^{j_0}$, $\alpha^{j_1}$ and $\alpha^{j_2}$ times a power of $\alpha$, so it is invertible in a field and also in
the ring of polynomials modulo $M_p(x)$ [3].

In the second case, the two sector failures occur in different rows of the stripe. In this case,
we must prove that if we have two erasures in locations $i$ and $j$ of row $\ell$, and two
erasures in locations $i$ and $j'$ of row $\ell'$, $0 \le i, j, j' \le n - 1$, $j, j' \ne i$, $0 \le \ell < \ell' \le r - 1$, then
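\[
\det \begin{pmatrix}
1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 \\
\alpha^{\ell n+i} & \alpha^{\ell n+j} & \alpha^{\ell' n+i} & \alpha^{\ell' n+j'} \\
\alpha^{2\ell n-i} & \alpha^{2\ell n-j} & \alpha^{2\ell' n-i} & \alpha^{2\ell' n-j'}
\end{pmatrix} \ne 0.
\]

After some row and column manipulation, the inequality above holds if and only if

\[
\det \begin{pmatrix}
\alpha^{\ell n+i}(1 \oplus \alpha^{j-i}) & \alpha^{\ell' n+i}(1 \oplus \alpha^{j'-i}) \\
\alpha^{2\ell n-j}(1 \oplus \alpha^{j-i}) & \alpha^{2\ell' n-j'}(1 \oplus \alpha^{j'-i})
\end{pmatrix} \ne 0.
\]

Both $1 \oplus \alpha^{j-i}$ and $1 \oplus \alpha^{j'-i}$ are invertible in $GF(2^w)$ since $1 \le |j - i|, |j' - i| < O(\alpha)$, and the
same is true in the polynomials modulo $M_p(x)$ [5], thus, the inequality above is satisfied if
and only if

\[
\det \begin{pmatrix}
1 & \alpha^{(\ell'-\ell)n} \\
1 & \alpha^{2(\ell'-\ell)n+j-j'}
\end{pmatrix}
= \alpha^{(\ell'-\ell)n} \left( 1 \oplus \alpha^{(\ell'-\ell)n+j-j'} \right) \ne 0.
\]

Redefining $\ell \leftarrow \ell' - \ell$ and $j \leftarrow j - j'$, we have $1 \le \ell \le r - 1$ and $-(n - 1) \le j \le n - 1$.

Thus, $\alpha^{\ell n+j} \ne 1$, since

\[
1 \le \ell n + j \le (r - 1)n + n - 1 = rn - 1 \le O(\alpha) - 1.
\]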

□ 
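The statement of Theorem 2.1 can also be checked exhaustively for small parameters. The sketch below is an independent illustration, not the authors' test code. It assumes GF(16) represented modulo x^4 + x + 1 with $\alpha = x$, assumes that the two global rows of (2) are $\alpha^{in+j}$ and $\alpha^{2in-j}$, and tests every pattern of one failed disk plus two further failed sectors by checking that the corresponding square submatrix of the parity-check matrix is invertible. All function names are hypothetical.

```python
from itertools import combinations

W, POLY = 4, 0b10011          # GF(16) via x^4 + x + 1 (an assumed, standard choice)
ORDER = (1 << W) - 1          # O(alpha) = 15 for the primitive element alpha = x

# Exponential and logarithm tables for alpha = 2 (the polynomial x).
EXP, LOG = [0] * (2 * ORDER), [0] * (ORDER + 1)
x = 1
for k in range(ORDER):
    EXP[k] = EXP[k + ORDER] = x
    LOG[x] = k
    x <<= 1
    if x & (1 << W):
        x ^= POLY


def mul(a, b):
    """Product in GF(16)."""
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def inv(a):
    """Inverse of a non-zero element of GF(16)."""
    return EXP[ORDER - LOG[a]]


def columns(r, n):
    """Columns of H for C(r, n, 1), assuming global rows alpha^(in+j), alpha^(2in-j)."""
    cols = []
    for i in range(r):
        for j in range(n):
            local = [1 if k == i else 0 for k in range(r)]
            cols.append(local + [EXP[(i * n + j) % ORDER],
                                 EXP[(2 * i * n - j) % ORDER]])
    return cols


def invertible(cols):
    """Gaussian elimination over GF(16) on the square matrix with these columns."""
    size = len(cols)
    m = [[cols[c][row] for c in range(size)] for row in range(size)]
    for p in range(size):
        pivot = next((q for q in range(p, size) if m[q][p]), None)
        if pivot is None:
            return False
        m[p], m[pivot] = m[pivot], m[p]
        scale = inv(m[p][p])
        m[p] = [mul(scale, v) for v in m[p]]
        for q in range(p + 1, size):
            if m[q][p]:
                f = m[q][p]
                m[q] = [a ^ mul(f, b) for a, b in zip(m[q], m[p])]
    return True


def is_sd(r, n):
    """Check every pattern of one failed disk plus two further failed sectors."""
    cols = columns(r, n)
    for disk in range(n):
        failed = [i * n + disk for i in range(r)]
        others = [c for c in range(r * n) if c % n != disk]
        for extra in combinations(others, 2):
            if not invertible([cols[c] for c in failed + list(extra)]):
                return False
    return True
```

Calling `is_sd(3, 5)` and `is_sd(5, 3)` exercises the two parameter sets of Example 2.1. Calling it with rn > 15, for instance `is_sd(4, 5)`, violates the hypothesis $rn \le O(\alpha)$ and is expected to find an uncorrectable pattern.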



By Theorem 2.1, the codes $C(3, 5, 1; f_\alpha(x))$ and $C(5, 3, 1; f_\alpha(x))$ in Example 2.1 are SD, as
well as code $C(4, 4, 1; M_{17}(x))$ in Example 2.2.
3 Construction of an SD code with m = 2 and s = 2

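Let $\alpha \in GF(2^w)$ and $rn \le O(\alpha)$. Consider the $(2r + 2) \times rn$ parity-check matrix

\[
\left( C_0 \; C_1 \; \cdots \; C_{n-1} \mid C_n \; C_{n+1} \; \cdots \; C_{2n-1} \mid \cdots \mid C_{(r-1)n} \; C_{(r-1)n+1} \; \cdots \; C_{rn-1} \right) \qquad (3)
\]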

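where $C_{in+j}$ denotes a column of length $2r + 2$, and, if $e_{in+j}$ denotes a $2r \times 1$ vector whose
coordinates are zero except for coordinates $2i$ and $2i + 1$, which are 1 and $\alpha^{j}$ respectively,
then, for $0 \le i \le r - 1$,

\[
\left( C_{in}, C_{in+1}, \ldots, C_{(i+1)n-1} \right) =
\begin{pmatrix}
e_{in} & e_{in+1} & \cdots & e_{in+j} & \cdots & e_{(i+1)n-1} \\
\alpha^{3in} & \alpha^{3in-1} & \cdots & \alpha^{3in-j} & \cdots & \alpha^{3in-(n-1)} \\
\alpha^{2in} & \alpha^{2(in+1)} & \cdots & \alpha^{2(in+j)} & \cdots & \alpha^{2((i+1)n-1)}
\end{pmatrix}. \qquad (4)
\]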



We denote as $C(r, n, 2; f_\alpha(x))$ the $[rn, r(n - 2) - 2]$ code over $GF(2^w)$ whose parity-check
matrix is given by (3) and (4).
Let us illustrate the construction of $C(r, n, 2; f_\alpha(x))$ with an example.

Example 3.1 As in Example 2.1, consider the finite field $GF(16)$ and let $\alpha$ be a primitive
element, i.e., $O(\alpha) = 15$. Then, the parity-check matrix of $C(3, 5, 2; f_\alpha(x))$ is given by

Similarly, the parity-check matrix of $C(5, 3, 2; f_\alpha(x))$ is given by
Next we prove the following theorem:

Theorem 3.1 Codes $C(r, n, 2; f_\alpha(x))$ and $C(r, n, 2; M_p(x))$ are SD codes.

Proof: Notice that 4 erasures in the same row will always be corrected. In effect, based
on the parity-check matrix of the code, this will happen if and only if, for any $0 \le i \le r - 1$
and $0 \le t_0 < t_1 < t_2 < t_3 \le n - 1$,
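\[
\det \begin{pmatrix}
1 & 1 & 1 & 1 \\
\alpha^{t_0} & \alpha^{t_1} & \alpha^{t_2} & \alpha^{t_3} \\
\alpha^{3in-t_0} & \alpha^{3in-t_1} & \alpha^{3in-t_2} & \alpha^{3in-t_3} \\
\alpha^{2(in+t_0)} & \alpha^{2(in+t_1)} & \alpha^{2(in+t_2)} & \alpha^{2(in+t_3)}
\end{pmatrix} \ne 0.
\]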

The determinant of this $4 \times 4$ matrix can be easily transformed into a Vandermonde
determinant on $\alpha^{t_0}$, $\alpha^{t_1}$, $\alpha^{t_2}$ and $\alpha^{t_3}$ times a power of $\alpha$, so it is invertible in a field and also
in the ring of polynomials modulo $M_p(x)$ [3].

Thus, the code will be SD if and only if, given three erasures in locations $i$, $j$ and $t$ of row
$\ell$, and three erasures in locations $i$, $j$ and $t'$ of row $\ell'$, $0 \le i < j \le n - 1$, $0 \le t, t' \le n - 1$,
$t, t' \ne i$ and $t, t' \ne j$, $0 \le \ell < \ell' \le r - 1$, then
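\[
\det \begin{pmatrix}
1 & 1 & 1 & 0 & 0 & 0 \\
\alpha^{i} & \alpha^{j} & \alpha^{t} & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & \alpha^{i} & \alpha^{j} & \alpha^{t'} \\
\alpha^{3\ell n-i} & \alpha^{3\ell n-j} & \alpha^{3\ell n-t} & \alpha^{3\ell' n-i} & \alpha^{3\ell' n-j} & \alpha^{3\ell' n-t'} \\
\alpha^{2(\ell n+i)} & \alpha^{2(\ell n+j)} & \alpha^{2(\ell n+t)} & \alpha^{2(\ell' n+i)} & \alpha^{2(\ell' n+j)} & \alpha^{2(\ell' n+t')}
\end{pmatrix} \ne 0.
\]

After some row and column manipulation, the inequality above holds if and only if

\[
\det \begin{pmatrix}
\alpha^{i}(1 \oplus \alpha^{j-i}) & \alpha^{i}(1 \oplus \alpha^{t-i}) & 0 & 0 \\
0 & 0 & \alpha^{i}(1 \oplus \alpha^{j-i}) & \alpha^{i}(1 \oplus \alpha^{t'-i}) \\
\alpha^{3\ell n-j}(1 \oplus \alpha^{j-i}) & \alpha^{3\ell n-t}(1 \oplus \alpha^{t-i}) & \alpha^{3\ell' n-j}(1 \oplus \alpha^{j-i}) & \alpha^{3\ell' n-t'}(1 \oplus \alpha^{t'-i}) \\
\alpha^{2(\ell n+i)}(1 \oplus \alpha^{j-i})^2 & \alpha^{2(\ell n+i)}(1 \oplus \alpha^{t-i})^2 & \alpha^{2(\ell' n+i)}(1 \oplus \alpha^{j-i})^2 & \alpha^{2(\ell' n+i)}(1 \oplus \alpha^{t'-i})^2
\end{pmatrix} \ne 0
\]

if and only if, taking some common factors in columns and in the first two rows (these factors
are invertible, as in the proof of Theorem 2.1),

\[
\det \begin{pmatrix}
1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 \\
\alpha^{3\ell n-j} & \alpha^{3\ell n-t} & \alpha^{3\ell' n-j} & \alpha^{3\ell' n-t'} \\
\alpha^{2(\ell n+i)}(1 \oplus \alpha^{j-i}) & \alpha^{2(\ell n+i)}(1 \oplus \alpha^{t-i}) & \alpha^{2(\ell' n+i)}(1 \oplus \alpha^{j-i}) & \alpha^{2(\ell' n+i)}(1 \oplus \alpha^{t'-i})
\end{pmatrix} \ne 0.
\]

Some more row and column manipulation gives that this determinant is nonzero if and only if

\[
\det \begin{pmatrix}
\alpha^{3\ell n-j-t} & \alpha^{3\ell' n-j-t'} \\
\alpha^{2\ell n+i} & \alpha^{2\ell' n+i}
\end{pmatrix} \ne 0
\]

if and only if

\[
\det \begin{pmatrix}
1 & \alpha^{(\ell'-\ell)n+t-t'} \\
1 & 1
\end{pmatrix}
= 1 \oplus \alpha^{(\ell'-\ell)n+t-t'} \ne 0.
\]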

Redefining $\ell \leftarrow \ell' - \ell$ and $t \leftarrow t - t'$, we have $1 \le \ell \le r - 1$ and $-(n - 1) \le t \le n - 1$.
Then, $1 \oplus \alpha^{\ell n+t} \ne 0$ reasoning as in Theorem 2.1. □

For instance, consider the finite field $GF(256)$ and let $\alpha$ be a primitive element, i.e.,
$O(\alpha) = 255$. Then, by Theorem 3.1, code $C(51, 5, 2; f_\alpha(x))$ is SD.
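The last step of both proofs is an exponent-range argument: every exponent $\ell n + t$ with $1 \le \ell \le r - 1$ and $|t| \le n - 1$ must avoid the multiples of $O(\alpha)$. A minimal sketch of that check, with a hypothetical function name, is:

```python
def exponents_avoid_order(r: int, n: int, order: int) -> bool:
    """True when no exponent l*n + t, 1 <= l <= r-1, |t| <= n-1, is 0 mod `order`."""
    return all((l * n + t) % order != 0
               for l in range(1, r)
               for t in range(-(n - 1), n))
```

For r = 51, n = 5 and order 255 the exponents range over 1, ..., 254, so the check succeeds; with r = 52 the exponent 255 appears and it fails.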

4 Shortening the Codes in order to Prune Searches 

Observe the following: if we have an SD code consisting of $r \times n$ arrays, we consider the
subcode of $r \times n$ arrays such that the last $r - r'$ rows are zero, where $r' < r$. These arrays
correspond to a shortening of the original code and are written simply as $r' \times n$ arrays.
Since shortening preserves the error-correcting properties of the code, the shortened code
consisting of $r' \times n$ arrays is also SD. This observation allows us to prune the search as
follows: in [6], we performed Monte Carlo searches to discover SD codes in cases where we
did not have constructions. Our methodology is to construct codes using random coefficients
to generate a parity-check matrix for given values of n, m, s and r in $GF(2^w)$. We then
test to see if the code is SD. If it is, then we generate a parity-check matrix for n, m, s and
r + 1, and test it to see if the code is SD. We continue until either r reaches a threshold, or
until the code is not SD. Because of the observation above regarding shortening, when we
discover a code that is not SD, we are guaranteed that codes for higher values of r are not
SD, and therefore we do not have to generate and test them.
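The pruning rule can be sketched as a simple loop. This is an illustration only, not the search code used in [6]: `make_candidate` and `is_sd` are hypothetical callables supplied by the caller, and the sketch assumes that `make_candidate` extends the same choice of coefficients to larger r, so that a failure at r rules out every larger r by the shortening argument.

```python
import random
from typing import Callable, List, Optional

Matrix = List[List[int]]


def pruned_search(
    make_candidate: Callable[[int, random.Random], Matrix],
    is_sd: Callable[[Matrix], bool],
    r_start: int,
    r_max: int,
    seed: Optional[int] = None,
) -> int:
    """Return the largest r in [r_start, r_max] reached before a non-SD code.

    Returns r_start - 1 when even the first candidate fails. No larger r is
    tried after a failure, following the shortening observation.
    """
    rng = random.Random(seed)
    best = r_start - 1
    for r in range(r_start, r_max + 1):
        if not is_sd(make_candidate(r, rng)):
            break  # prune: skip every larger r
        best = r
    return best
```

The SD test is passed in as a parameter because it is the expensive step; the loop itself only decides when to stop generating candidates.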